Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(security): add security config to disable it #1498

Merged
merged 9 commits into from
Jan 2, 2025
Merged

Conversation

ArslanSaleem
Copy link
Collaborator

@ArslanSaleem ArslanSaleem commented Jan 2, 2025

Important

Adds a security configuration option to control security checks in code execution, updating logic and tests accordingly.

  • Security Configuration:
    • Adds security option to Config in df_config.py with values standard, none, advanced.
    • Updates chat() in base.py to check security level before raising MaliciousQueryError.
    • Modifies get_environment() in optional.py to conditionally use restricted libraries based on secure parameter.
  • Code Execution:
    • Updates get_code_to_run() in code_cleaning.py to check security level before raising MaliciousQueryError.
    • Updates execute_code() in code_execution.py to use secure parameter in get_environment().
  • Tests:
    • Adds tests in test_optional_dependency.py to verify environment setup with different security levels.
    • Updates tests in test_code_cleaning.py to handle new security configuration.

This description was created by Ellipsis for 2363087. It will automatically update as commits are pushed.

@ArslanSaleem ArslanSaleem requested a review from gventuri January 2, 2025 11:37
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jan 2, 2025
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to cc73f91 in 38 seconds

More details
  • Looked at 292 lines of code in 7 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. pandasai/agent/base.py:262
  • Draft comment:
    Consider storing the security level check in a variable to avoid repeated evaluations. This pattern is repeated in other files as well.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The code checks for security levels 'standard' and 'advanced' multiple times. It would be more efficient to store this check in a variable and reuse it.
2. pandasai/pipelines/chat/code_cleaning.py:124
  • Draft comment:
    Consider storing the security level check in a variable to avoid repeated evaluations. This pattern is repeated in other files as well.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The code checks for security levels 'standard' and 'advanced' multiple times. It would be more efficient to store this check in a variable and reuse it.
3. pandasai/pipelines/chat/code_execution.py:158
  • Draft comment:
    Consider storing the security level check in a variable to avoid repeated evaluations. This pattern is repeated in other files as well.
  • Reason this comment was not posted:
    Confidence changes required: 20%
    The code checks for security levels 'standard' and 'advanced' multiple times. It would be more efficient to store this check in a variable and reuse it.
4. pandasai/agent/base.py:267
  • Draft comment:
    The error message is too verbose. Consider simplifying it for better readability.
                    "The query contains potentially unsafe references to system modules or methods."
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_tNhKMNiyIBqT0qMu


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on fdb7985 in 38 seconds

More details
  • Looked at 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. pandasai/safe_libs/base_restricted_module.py:10
  • Draft comment:
    Consider rephrasing the error message for clarity:
f"Security risk detected: Usage of '{arg}' is prohibited."
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The error message on line 11 could be improved for clarity and professionalism.

Workflow ID: wflow_QyIrtA2K5lz5I4VV


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

pandasai/safe_libs/base_restricted_module.py Show resolved Hide resolved
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 8606ea3 in 1 minute and 2 seconds

More details
  • Looked at 44 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_X58lWLTaP2stASqI


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

tests/unit_tests/agent/test_agent.py Outdated Show resolved Hide resolved
tests/unit_tests/agent/test_agent.py Show resolved Hide resolved
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on d5eaa65 in 29 seconds

More details
  • Looked at 20 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. tests/unit_tests/agent/test_agent.py:715
  • Draft comment:
    Consider adding comments to clarify the purpose of the test cases for both malicious and safe queries.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The test test_query_detection_disable_security is missing comments for both positive and negative cases, which can make it less clear for future developers.
2. tests/unit_tests/agent/test_agent.py:734
  • Draft comment:
    The error message is not grammatically correct. Consider rephrasing it for clarity:
            The query contains references to the 'io' or 'os' modules or the 'b64decode' method, which can be used to execute or access system resources in unsafe ways.
  • Reason this comment was not posted:
    Comment was not on a valid diff hunk.

Workflow ID: wflow_2c6tjGKWkUwFO5PN


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 00f35ab in 34 seconds

More details
  • Looked at 18 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_L1l52MzOELJsNfw3


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 21b9ba3 in 24 seconds

More details
  • Looked at 23 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_R1DVgXIkwENKU69v


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link

codecov bot commented Jan 2, 2025

Codecov Report

Attention: Patch coverage is 60.60606% with 13 lines in your changes missing coverage. Please review.

Project coverage is 78.54%. Comparing base (cfeb071) to head (2363087).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
pandasai/helpers/optional.py 51.85% 13 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1498      +/-   ##
==========================================
- Coverage   78.62%   78.54%   -0.08%     
==========================================
  Files         157      157              
  Lines        6311     6325      +14     
==========================================
+ Hits         4962     4968       +6     
- Misses       1349     1357       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 2363087 in 15 seconds

More details
  • Looked at 12 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. docs/library.mdx:225
  • Draft comment:
    Consider rephrasing the description of the security parameter for clarity:
- `security`: The "security" parameter offers three levels: "none," "standard," and "advanced." "Standard" and "advanced" are useful for detecting malicious intent and preventing harmful code execution. By default, "security" is set to "standard." Note that stricter security checks might flag benign queries as harmful. You can disable security by setting it to "none."
  • Reason this comment was not posted:
    Confidence changes required: 33%
    The description of the security parameter is clear and concise, but it could benefit from a slight rephrasing for better readability.
2. docs/library.mdx:225
  • Draft comment:
    Ensure consistent use of quotation marks around 'security'.
  • Reason this comment was not posted:
    Confidence changes required: 50%
    The use of quotation marks around the word 'security' is inconsistent. It should be consistent throughout the document.

Workflow ID: wflow_wL9qGKN5QT0GIlp3


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@gventuri gventuri merged commit 554a638 into main Jan 2, 2025
9 of 11 checks passed
gventuri added a commit that referenced this pull request Jan 8, 2025
* fix: fixed duplicate export dynamics (#1399)

* chore: minor updates in the README (#1400)

* docs: update library.mdx (#1406)

Add 's' in the end of keyword "clarification_question"

agent.clarification_question('What is the GDP of the United States?')

* fix[#1415]: using torch 2.2.0 in macOS x86_64 (#1417)

* Add /app to PYTHONPATH

* fix for issue #1415

---------

Co-authored-by: Giuseppe Coco <[email protected]>

* chore: add /app to PYTHONPATH (#1416)

Co-authored-by: Giuseppe Coco <[email protected]>

* docs: update llms.mdx (#1410)

Add documentation of how to use pandasai.json file

* fix[output_format]: accept dataframe dict as output and secure sql qu… (#1432)

* fix[output_format]: accept dataframe dict as output and secure sql query execution

* fix: ruff errors

* chore[Security]: restrict libs to allow specific functionalities (#1429)

* chore[Security]: restrict libs to allow specific functionalities

* remove: extra lib handling

* fix: ruff errors

* fix: error message for bad import

* fix: add io library in the blacklist

* docs: update docs about customer whitelisted dependencies

* Release v2.4.0

* fix: update last_code_generated in smart_dataframe's __init__.py (#1484)

In SmartDataframe class the last_code_generated property returned _agent.last_code_executed instead of _agent.last_code_generated. In SmartDatalake it is implemented properly.

Co-authored-by: Bence Kecskés <[email protected]>

* Release v2.4.1

* fix: docker-compose-npm-error (#1486)

-Simplified the type of children to just React.ReactNode, which is the standard type for React components' children prop.

* fix: remove plt.show if exists in the generated code (#1501)

* fix: make seaborn as an optional dependency (#1500)

* fix: make seaborn as an optional dependency

* fix: linting errors

* fix: check if whitelisted lib is actually exists in the additional deps (#1499)

* feat(security): add security config to disable it (#1498)

* feat(security): add security config to disable it

* fix: linting errors

* fix(safety): push exact match for get attributes

* add additional test case

* fix: test case

* fix:  linting errors

* fix: linting errors

* docs(config): update config doc to add new config attribute

* Release v2.4.2

* fix(test_cases): handle and clean test cases of pandasai

* ruff changes

* add poetry lock file

* fix github ci workflow

* fix: extension deps installation

* fix: github workflows

* fix: unnecessary comments

* fix: ruff errors

* fix: ruff errors

* fix: typos

* fix ci yml file

* fix: ci

* fix make file for hardcoded env

* fix make file for hardcoded env

* fix make file for hardcoded env

* fix make file for hardcoded env

* fix: imports in extensions

* fix(testcases): extension test cases fixed

* fix: ci clear cache and suggested improvements

* clean up before running CI

* fix ci pipeline

* fix ci pipeline

* fix ci pipeline

* fix ci pipeline

* fix ci pipeline

* fix: ci

* fix: ci

* fix: ci

* fix github bug

* fix: CI

* fix: CI

* fix: test case on windows

* fix: test case on windows

* fix: test case on windows

* fix: test case on windows

* fix: CI

* feat(biqquery): add test cases for big query connector

* fix env issue of pandas not found

* fix: ci use pyproject

* fix: lock file

---------

Co-authored-by: AlessandroMarc <[email protected]>
Co-authored-by: Smoothengineer <[email protected]>
Co-authored-by: Muhammad Adam <[email protected]>
Co-authored-by: giuseppe-coco <[email protected]>
Co-authored-by: Giuseppe Coco <[email protected]>
Co-authored-by: Gabriele Venturi <[email protected]>
Co-authored-by: bencekecskes <[email protected]>
Co-authored-by: Bence Kecskés <[email protected]>
Co-authored-by: Charis Nikolaidis <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants